3 research outputs found

PoCL-R: An Open Standard Based Offloading Layer for Heterogeneous Multi-Access Edge Computing with Server Side Scalability

Author: Babej Michal
Ikkala Julius
Jääskeläinen Pekka
Solanti Jan
Publication date: 01/09/2023

We propose a novel computing runtime that exposes remote compute devices via the cross-vendor open heterogeneous computing standard OpenCL and can execute compute tasks on the MEC cluster side across multiple servers in a scalable manner. Intermittent UE connection loss is handled gracefully even if the device's IP address changes on the way. Network-induced latency is minimized by transferring data and signaling command completions between remote devices in a peer-to-peer fashion directly to the target server with a streamlined TCP-based protocol that yields a command latency of only 60 microseconds on top of network round-trip latency in synthetic benchmarks. The runtime can utilize RDMA to speed up inter-server data transfers by an additional 60% compared to the TCP-based solution. The benefits of the proposed runtime in MEC applications are demonstrated with a smartphone-based augmented reality rendering case study. Measurements show up to 19x improvements to frame rate and 17x improvements to local energy consumption when using the proposed runtime to offload AR rendering from a smartphone. Scalability to multiple GPU servers in real-world applications is shown in a computational fluid dynamics simulation, which scales with the number of servers at roughly 80% efficiency, comparable to an MPI port of the same simulation.

Comment: 13 pages, 17 figures

arXiv.org e-Print Archive

PoCL-R : A Scalable Low Latency Distributed OpenCL Runtime

Author: Babej Michal
Ikkala Julius
Jääskeläinen Pekka
Malamal Vadakital Vinod Kumar
Solanti Jan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 12/07/2021

Offloading the most demanding parts of applications to an edge GPU server cluster to save power or improve the result quality is a solution that becomes increasingly realistic with new networking technologies. To make such a computing scheme feasible, an application programming layer that can provide both low latency and scalable utilization of remote heterogeneous computing resources is needed. To this end, we propose a latency-optimized scalable distributed heterogeneous computing runtime implementing the standard OpenCL API. In the proposed runtime, network-induced latency is reduced by means of peer-to-peer data transfers and event synchronization as well as a streamlined control protocol implementation. Further improvements can be obtained by streaming source data directly from the producer device to the compute cluster. Compute cluster scalability is improved by distributing the command and event processing responsibilities to remote compute servers. We also show how a simple optional dynamic content size buffer OpenCL extension can significantly speed up applications that utilize variable-length data. For evaluation, we present a smartphone-based augmented reality rendering case study which, using the runtime, receives a 19× improvement in frames per second and a 17× improvement in energy per frame when offloading parts of the rendering workload to a nearby GPU server. The remote kernel execution latency overhead of the runtime is only 60 µs on top of the network round-trip time. The scalability on multi-server multi-GPU clusters is shown with a distributed large matrix multiplication application.

Accepted version; peer reviewed

Trepo - Institutional Repository of Tampere University

ANALYZA – Data Warehouse (ANALYZA – Datový sklad)

Author: Babej Matej
Batko Michal
Brilla Dávid
Drašar Martin
Guoth Matúš
Kažimír Martin
Macák Martin
Rebok Tomáš
Tovarňák Daniel
Zezula Pavel
Čermák Milan
Publication date: 01/01/2020

The Data Warehouse software component implements the storage of all data in the ANALYZA system, stored in practice as (large) files, objects, and the links between them. In addition to storing data intended for analysis, the Data Warehouse also provides a space for storing intermediate results of analytical operations, for example by extending existing objects with new information or attributes, or by creating entirely new objects or files. The software primarily emphasizes the scalability, reliability, speed, and flexibility of the whole solution. In addition to the data warehouse itself, the archive contains a demonstration of supplementary software that interconnects several Data Warehouses (the so-called proxy component) and a demonstration of populating them through a user-friendly environment.

Univerzitní repozitář Masarykovy univerzity